On the Integration of Optical Flow and Action Recognition
نویسندگان
چکیده
Most of the top performing action recognition methods use optical flow as a “black box” input. Here we take a deeper look at the combination of flow and action recognition, and investigate why optical flow is helpful, what makes a flow method good for action recognition, and how we can make it better. In particular, we investigate the impact of different flow algorithms and input transformations to better understand how these affect a state-of-the-art action recognition method. Furthermore, we fine tune two neural-network flow methods end-to-end on the most widely used action recognition dataset (UCF101). Based on these experiments, we make the following five observations: 1) optical flow is useful for action recognition because it is invariant to appearance, 2) optical flow methods are optimized to minimize end-point-error (EPE), but the EPE of current methods is not well correlated with action recognition performance, 3) for the flow methods tested, accuracy at boundaries and at small displacements is most correlated with action recognition performance, 4) training optical flow to minimize classification error instead of minimizing EPE improves recognition performance, and 5) optical flow learned for the task of action recognition differs from traditional optical flow especially inside the human body and at the boundary of the body. These observations may encourage optical flow researchers to look beyond EPE as a goal and guide action recognition researchers to seek better motion cues, leading to a tighter integration of the optical flow and action recognition communities.
منابع مشابه
Transfer from action to perception: The effect of motor-perceptual enrichment
This study investigated the effect of audiovisual integration on action-perception transfer.40 subjects were randomly divided four groups: visual, visual-auditory, control visual and control visual-auditory. Visual groups watched pattern skilled basketball player and other groups in addition to watching pattern skilled basketball player, heard Elbow angular velocity as sonification. In first st...
متن کاملAction Change Detection in Video Based on HOG
Background and Objectives: Action recognition, as the processes of labeling an unknown action of a query video, is a challenging problem, due to the event complexity, variations in imaging conditions, and intra- and inter-individual action-variability. A number of solutions proposed to solve action recognition problem. Many of these frameworks suppose that each video sequence includes only one ...
متن کاملIntegration of Deep Learning Algorithms and Bilateral Filters with the Purpose of Building Extraction from Mono Optical Aerial Imagery
The problem of extracting the building from mono optical aerial imagery with high spatial resolution is always considered as an important challenge to prepare the maps. The goal of the current research is to take advantage of the semantic segmentation of mono optical aerial imagery to extract the building which is realized based on the combination of deep convolutional neural networks (DCNN) an...
متن کاملApplication of Combined Local Object Based Features and Cluster Fusion for the Behaviors Recognition and Detection of Abnormal Behaviors
In this paper, we propose a novel framework for behaviors recognition and detection of certain types of abnormal behaviors, capable of achieving high detection rates on a variety of real-life scenes. The new proposed approach here is a combination of the location based methods and the object based ones. First, a novel approach is formulated to use optical flow and binary motion video as the loc...
متن کاملCombining Spatio-Temporal Appearance Descriptors and Optical Flow for Human Action Recognition in Video Data
This paper proposes combining spatio-temporal appearance (STA) descriptors with optical flow for human action recognition. The STA descriptors are local histogram-based descriptors of space-time, suitable for building a partial representation of arbitrary spatio-temporal phenomena. Because of the possibility of iterative refinement, they are interesting in the context of online human action rec...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1712.08416 شماره
صفحات -
تاریخ انتشار 2017